AITopics | hardness measure

What makes math problems hard for reinforcement learning: a case study

Neural Information Processing Systems, Jun-23-2026, 01:20:28 GMT

Using a long-standing conjecture from combinatorial group theory, we explore, from multiple perspectives, the challenges of finding rare instances carrying disproportionately high rewards. Based on lessons learned in the context defined by the Andrews-Curtis conjecture, we analyze how reinforcement learning agents handle problems of varying hardness. We also address many mathematical questions as a part of our study. Notably, we demonstrate the length reducibility of all but two presentations in the Akbulut-Kirby series (1981), and resolve various potential counterexamples in the Miller-Schupp series (1991), including three infinite subfamilies.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.46)

Genre: Research Report > Experimental Study (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

5eeb693f46d753e5fe24c97212c22bd2-Paper-Conference.pdf

Neural Information Processing Systems, Feb-9-2026, 08:25:18 GMT

Second, we introduce Colosseum, a pioneering package that enables empirical hardness analysis and implements a principled benchmark composed of environments that are diverse with respect to different measures of hardness.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.73)

"How hard is my MDP?" The distribution-norm to the rescue

Neural Information Processing Systems, Sep-30-2025, 09:31:30 GMT

In Reinforcement Learning (RL), state-of-the-art algorithms require a large number of samples per state-action pair to estimate the transition kernel $p$. In many problems, a good approximation of $p$ is not needed. For instance, if from one state-action pair $(s,a)$, one can only transit to states with the same value, learning $p(\cdot|s,a)$ accurately is irrelevant (only its support matters). This paper aims at capturing such behavior by defining a novel hardness measure for Markov Decision Processes (MDPs) we call the {\em distribution-norm}. The distribution-norm w.r.t.~a measure $\nu$ is defined on zero $\nu$-mean functions $f$ by the standard variation of $f$ with respect to $\nu$. We first provide a concentration inequality for the dual of the distribution-norm. This allows us to replace the generic but loose $||\cdot||_1$ concentration inequalities used in most previous analysis of RL algorithms, to benefit from this new hardness measure. We then show that several common RL benchmarks have low hardness when measured using the new norm. The distribution-norm captures finer properties than the number of states or the diameter and can be used to assess the difficulty of MDPs.
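The definition in the abstract above can be illustrated with a small sketch. It assumes that "standard variation" means the standard deviation of $f$ under $\nu$, and it centers $f$ first so that the zero-mean condition holds. The helper name `distribution_norm` and the toy numbers are hypothetical and do not come from the paper.

```python
import math


def distribution_norm(f, nu):
    """Standard deviation of f under the probability vector nu.

    Hypothetical helper: f is centered internally, so the result equals the
    norm of the zero-nu-mean function f - E_nu[f].
    """
    if len(f) != len(nu):
        raise ValueError("f and nu must have the same length")
    if abs(sum(nu) - 1.0) > 1e-9:
        raise ValueError("nu must sum to 1")
    mean = sum(w * x for w, x in zip(nu, f))
    variance = sum(w * (x - mean) ** 2 for w, x in zip(nu, f))
    return math.sqrt(variance)


# Toy next-state distribution p(.|s,a) over four states; only two are reachable.
p_sa = [0.5, 0.5, 0.0, 0.0]

# Case 1: every reachable next state has the same value.
same_values = [3.0, 3.0, 7.0, -1.0]
# Case 2: the reachable next states have different values.
mixed_values = [0.0, 4.0, 7.0, -1.0]

flat = distribution_norm(same_values, p_sa)     # 0.0
spread = distribution_norm(mixed_values, p_sa)  # 2.0
print(flat, spread)
```

In the first case the norm is zero, which matches the abstract's remark that only the support of $p(\cdot|s,a)$ matters when all reachable states share one value.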

mdp, name change, proceedings, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.60)

Hardness in Markov Decision Processes: Theory and Practice

Neural Information Processing Systems, Aug-15-2025, 05:20:03 GMT

Finally, we benchmark five tabular agents in our newly proposed benchmark.

agent, complexity, hardness, (13 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.84)

"How hard is my MDP?" The distribution-norm to the rescue

Odalric-Ambrym Maillard, Timothy A. Mann, Shie Mannor

Neural Information Processing Systems, Feb-8-2025, 21:53:44 GMT

In Reinforcement Learning (RL), state-of-the-art algorithms require a large number of samples per state-action pair to estimate the transition kernel p. In many problems, a good approximation of p is not needed. For instance, if from one state-action pair (s, a), one can only transit to states with the same value, learning p(·|s, a) accurately is irrelevant (only its support matters). This paper aims at capturing such behavior by defining a novel hardness measure for Markov Decision Processes (MDPs) based on what we call the distribution-norm. The distribution-norm w.r.t. a measure ν is defined on zero ν-mean functions f by the standard variation of f with respect to ν. We first provide a concentration inequality for the dual of the distribution-norm.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Israel > Haifa District > Haifa (0.04)
North America > United States (0.04)
Europe > Spain > Andalusia > Granada Province > Granada (0.04)
Europe > Austria > Styria > Leoben (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.91)

built from n i.i.d samples, we control ||p p

Neural Information Processing Systems, Mar-13-2024, 07:02:13 GMT

In Reinforcement Learning (RL), state-of-the-art algorithms require a large number of samples per state-action pair to estimate the transition kernel p. In many problems, a good approximation of p is not needed. For instance, if from one state-action pair (s, a), one can only transit to states with the same value, learning p(·|s, a) accurately is irrelevant (only its support matters). This paper aims at capturing such behavior by defining a novel hardness measure for Markov Decision Processes (MDPs) based on what we call the distribution-norm. The distribution-norm w.r.t. a measure ν is defined on zero ν-mean functions f by the standard variation of f with respect to ν. We first provide a concentration inequality for the dual of the distribution-norm.

algorithm, hardness, mdp, (14 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Israel > Haifa District > Haifa (0.04)
North America > United States (0.04)
Europe > Spain > Andalusia > Granada Province > Granada (0.04)
Europe > Austria > Styria > Leoben (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.91)

The Unreasonable Effectiveness of Easy Training Data for Hard Tasks

Hase, Peter, Bansal, Mohit, Clark, Peter, Wiegreffe, Sarah

arXiv.org Artificial Intelligence, Jan-12-2024

How can we train models to perform well on hard test data when hard training data is by definition difficult to label correctly? This question has been termed the scalable oversight problem and has drawn increasing attention as language models have continually improved. In this paper, we present the surprising conclusion that current language models often generalize relatively well from easy to hard data, even performing as well as "oracle" models trained on hard data. We demonstrate this kind of easy-to-hard generalization using simple training methods like in-context learning, linear classifier heads, and QLoRA for seven different measures of datapoint hardness, including six empirically diverse human hardness measures (like grade level) and one model-based measure (loss-based). Furthermore, we show that even if one cares most about model performance on hard data, it can be better to collect and train on easy data rather than hard data, since hard data is generally noisier and costlier to collect. Our experiments use open models up to 70b in size and four publicly available question-answering datasets with questions ranging in difficulty from 3rd grade science questions to college level STEM questions and general-knowledge trivia. We conclude that easy-to-hard generalization in LMs is surprisingly strong for the tasks studied, suggesting the scalable oversight problem may be easier than previously thought. Our code is available at https://github.com/allenai/easy-to-hard-generalization

generalization, hard data, hardness measure, (15 more...)

arXiv.org Artificial Intelligence

2401.06751

Country: North America > United States > New York (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Characterizing instance hardness in classification and regression problems

Torquette, Gustavo P., Nunes, Victor S., Paiva, Pedro Y. A., Neto, Lourenço B. C., Lorena, Ana C.

arXiv.org Artificial Intelligence, Dec-4-2022

Some recent pieces of work in the Machine Learning (ML) literature have demonstrated the usefulness of assessing which observations are hardest to have their label predicted accurately. By identifying such instances, one may inspect whether they have any quality issues that should be addressed. Learning strategies based on the difficulty level of the observations can also be devised. This paper presents a set of meta-features that aim at characterizing which instances of a dataset are hardest to have their label predicted accurately and why they are so, aka instance hardness measures. Both classification and regression problems are considered. Synthetic datasets with different levels of complexity are built and analyzed. A Python package containing all implementations is also provided.
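As a rough illustration of what an instance hardness measure computes, the sketch below implements k-Disagreeing Neighbors (kDN), a measure from the wider instance hardness literature: the fraction of an instance's k nearest neighbors that carry a different label. The abstract does not say which measures the paper uses, so kDN is an assumption here, and the function name and toy data are hypothetical.

```python
import math


def k_disagreeing_neighbors(points, labels, k=3):
    """Return, for each instance, the fraction of its k nearest neighbors
    (Euclidean distance, excluding itself) whose label differs from its own."""
    scores = []
    for i, p in enumerate(points):
        # Indices of all other instances, sorted by distance to instance i.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        nearest = others[:k]
        disagree = sum(labels[j] != labels[i] for j in nearest)
        scores.append(disagree / len(nearest))
    return scores


# Two well-separated clusters; instance 3 sits inside the first cluster
# but carries the label of the second one.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.05, 0.05),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
]
labels = [0, 0, 0, 1, 1, 1, 1, 1]

kdn = k_disagreeing_neighbors(points, labels, k=3)
hardest = max(range(len(kdn)), key=kdn.__getitem__)
print(hardest, kdn[hardest])
```

Instances with a score near 1 are candidates for label noise or class overlap, which is the kind of quality inspection the abstract describes.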

artificial intelligence, dataset, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2212.01897

Country: South America > Brazil > São Paulo (0.05)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

PyHard – A tool to assess dataset quality and identify hard-to-classify instances in general - DataScienceCentral.com

#artificialintelligence, Mar-13-2022, 03:47:11 GMT

The article explains the algorithm behind the recently introduced Python package PyHard, which is based on the concept of Instance Space Analysis. It helps assess the quality of a dataset and identify which instances are hard or easy to classify. With the help of this algorithm we can separate out noisy…

algorithm, dataset, hardness measure, (13 more...)

#artificialintelligence

Genre: Research Report (0.50)

Industry: Health & Medicine (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.50)

"How hard is my MDP?" The distribution-norm to the rescue

Maillard, Odalric-Ambrym, Mann, Timothy A., Mannor, Shie

Neural Information Processing Systems, Feb-14-2020, 08:42:42 GMT

In Reinforcement Learning (RL), state-of-the-art algorithms require a large number of samples per state-action pair to estimate the transition kernel $p$. In many problems, a good approximation of $p$ is not needed. For instance, if from one state-action pair $(s,a)$, one can only transit to states with the same value, learning $p(\cdot|s,a)$ accurately is irrelevant (only its support matters). This paper aims at capturing such behavior by defining a novel hardness measure for Markov Decision Processes (MDPs) we call the {\em distribution-norm}. The distribution-norm w.r.t. a measure $\nu$ is defined on zero $\nu$-mean functions $f$ by the standard variation of $f$ with respect to $\nu$.

concentration inequality, hardness measure, state-action pair, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.42)
